Speech Recognition Based on Feature Extraction with Variable Rate Frequency Sampling
Abstract
Most feature extraction techniques begin with a Discrete Fourier Transform (DFT) of consecutive, short, overlapping windows. The spectral resolution of the DFT representation is uniform and is given by Δf = 2π/N, where N is the length of the window. The present paper investigates non-uniform rate frequency sampling, varying as a function of the spectral characteristics of each frame, in the context of Automatic Speech Recognition. We are motivated by the non-uniform spectral sensitivity of human hearing and by the need for a feature extraction technique that autofocuses on the most reliable parts of the spectrum in noisy conditions.
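To illustrate the contrast between uniform and non-uniform frequency sampling, the following is a minimal sketch and not the paper's method. It evaluates the spectrum of one windowed frame on the uniform DFT grid (spacing 2π/N) and on a non-uniform grid. The function names (dtft_at, uniform_grid, warped_grid), the exponential warping, and its alpha parameter are assumptions made for illustration only. In particular, the grid here is fixed, whereas the abstract describes a sampling rate that adapts to the spectral characteristics of each frame.

```python
import numpy as np

def dtft_at(frame, omegas):
    """Evaluate the DTFT of `frame` at arbitrary radian frequencies `omegas`."""
    n = np.arange(len(frame))
    # One row per requested frequency: sum_n x[n] * exp(-j * omega * n)
    return np.exp(-1j * np.outer(omegas, n)) @ frame

def uniform_grid(N):
    """DFT bin frequencies: constant spacing 2*pi/N over [0, 2*pi)."""
    return 2 * np.pi * np.arange(N) / N

def warped_grid(num_points, alpha=3.0):
    """Hypothetical non-uniform grid on [0, pi): denser at low frequencies.

    The exponential warping and `alpha` are illustrative assumptions,
    not the frame-dependent rule investigated in the paper.
    """
    u = np.arange(num_points) / num_points
    return np.pi * np.expm1(alpha * u) / np.expm1(alpha)

# A synthetic Hamming-windowed frame stands in for a short speech window.
rng = np.random.default_rng(0)
N = 64
frame = rng.standard_normal(N) * np.hamming(N)

# Sampling the DTFT on the uniform grid reproduces the ordinary DFT.
uniform_spectrum = dtft_at(frame, uniform_grid(N))

# Sampling on the warped grid spends more points on low frequencies.
warped_spectrum = dtft_at(frame, warped_grid(32))
```

Evaluating the transform directly at chosen frequencies costs O(N·M) for M frequencies, which is acceptable for a short frame and keeps the choice of grid fully explicit.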
Similar resources
The Romanian speech synthesis (RSS) corpus: Building a high quality HMM-based speech synthesis system using a high sampling rate
This paper first introduces a newly-recorded high quality Romanian speech corpus designed for speech synthesis, called “RSS”, along with Romanian front-end text processing modules and HMM-based synthetic voices built from the corpus. All of these are now freely available for academic use in order to promote Romanian speech technology research. The RSS corpus comprises 3500 training sentences an...
Improving of Feature Selection in Speech Emotion Recognition Based-on Hybrid Evolutionary Algorithms
One of the important issues in speech emotion recognition is selecting appropriate feature sets in order to improve the detection rate and classification accuracy. In previous studies, researchers tried to select appropriate features for classification by using feature selection and feature-space reduction methods, such as Fisher and PCA. In this research, a hybrid evolutionary algorit...
Signal Modeling with Non-uniform Time Sampling of Features for Automatic Speech Recognition
Montri Karnjanadecha, Old Dominion University, 2000. Director: Dr. Stephen A. Zahorian. This dissertation presents an investigation of nonuniform time sampling methods for spectral/temporal feature extraction in speech. Frame-based features were computed based on an encoding of the global spectral shape usi...
Techniques for capturing temporal variations in speech signals with fixed-rate processing
Fixed-rate feature extraction, which is used in most current speech recognizers, is equivalent to sampling the feature trajectories at a uniform rate. Often this sampling rate is well below the Nyquist rate and thus leads to distortions in the sampled feature stream due to aliasing. In this paper we explore various techniques, ranging from simple cepstral and spectral smoothing to filtering and dat...
Speech recognition at multiple sampling rates
A feature extraction scheme is presented that analyzes speech signals sampled at different sampling rates. This will be needed in the future because of terminals in the telecom network that will transmit speech information also in the frequency region above 4 kHz. A cepstral analysis scheme is applied in the frequency range up to 4 kHz to create a common set of acoustic parameters for all sampl...